Abstract

This paper considers the problem of determining the mean and distribution
of the length of a minimal spanning tree (MST) on an undirected graph whose
arc lengths are independently distributed random variables. We obtain bounds
and approximations for the MST length and show that our upper bound is much
tighter than the naive bound obtained by computing the MST length of the
deterministic graph with the respective means as arc lengths. We analyze the
asymptotic properties of our approximations and establish conditions under
which our bounds are asymptotically optimal. We apply these results to a
network provisioning problem and show that the relative error induced by
using our approximations tends to zero as the graph grows large.


A spanning tree is a connected, acyclic subgraph that spans (i.e., includes) all
the nodes of a given graph (see Harary [1969]; Lawler [1976] for definitions and
properties). A minimal (maximal) spanning tree is a spanning tree that has the
minimum (maximum) total edge weight, the minimum (maximum) being taken over
the set of all spanning trees of the given graph.

We consider the problem of determining the probability distribution function
(DF) and expected value of the length of a minimal spanning tree (MST) for an
undirected graph with random arc lengths. An obvious approximation to the
expected length of an MST in random graphs is obtained by replacing the
arc-length random variables by their respective expectations and computing the
MST length for the resulting deterministic graph. This approximation is an upper
bound for the true expected length of the MST. Our approximation provides a
tighter bound. Our approach also enables us to approximate the DF of the MST
length, and we prove that our approximation is asymptotically precise with
probability one.

A classical application for the minimal spanning tree is the design of
communication networks. As an application of our results we consider a
probabilistic version of the network design problem. An order is to be placed
for material to construct a communication network (a minimal spanning tree). At
the time that the order is placed, the arc lengths are not known with certainty.
Over- or under-supply of material results in increased costs. We seek an order
level which minimizes the expected total cost. Our approach will provide
asymptotically optimal solutions to this problem.

The organization of this paper is as follows. Section 1 introduces the stochastic
spanning tree problem and our approach to its approximation. In Section 2 we
present some Monte Carlo results as well as an analytical characterization of the
accuracy of our approximation. Section 3 proceeds to establish some asymptotic
properties of our bounds, and Section 4 presents an application of these bounds
to the network provisioning problem and results pertaining to their asymptotic
optimality. Finally, Section 5 concludes the paper with some remarks on open
questions.
1  The Stochastic Spanning Tree Problem

The stochastic MST problem can be formulated in several ways. Gilbert [1965]
considered the problem of constructing minimal spanning trees to connect $n$
points placed at random in the unit circle $\|P\| < 1$ according to a Poisson
process, where $\|\cdot\|$ represents the "distance" of a point from the origin.
Gilbert considered three different norms (Euclidean, Manhattan, and maximum) on
Cartesian surfaces and obtained asymptotic bounds on the expected length of the
MST. For all choices of the norm, he estimated this length to be asymptotic to
$c(\pi n)^{1/2}$ and showed that the constant $c$ is less than $2^{-1/2}$.
Steele [1981] considered growth rates of MST's on $n$ points distributed in a
Euclidean space, and Lueker [1981] obtained asymptotic results on expected
lengths of maximum spanning trees when arc lengths are drawn independently from
the unit normal distribution. In a remarkable paper, Frieze [1985] recently
established that when arc lengths on a complete graph are non-negative i.i.d.
random variables having a common DF $F$, which is differentiable at zero with
$F'(0) = D > 0$ and has finite mean and variance, then the length of the MST
tends to $\zeta(3)/D$ in expectation as well as in probability, where

    $\zeta(3) = \sum_{k=1}^{\infty} 1/k^3 = 1.202\ldots$

The stochastic framework we posit for our problem is close in spirit to the
classical MST problem in the deterministic case: the graph structure, i.e., the
configuration of the nodes and arcs, is considered given; only the arc lengths
are random variables, each with its own probability distribution.

Let $G = (N, A)$ denote a graph with $N$ being the set of nodes, and
$A \subseteq N \times N$ being the set of undirected pairs of nodes called arcs.
We shall always label the nodes of $G$ from 1 through $n$, where $n = |N|$.

Let $X_{ij}$ denote the random variable representing the length of the arc
between node $i$ and node $j$ (since the arcs are undirected, we make no
distinction between $X_{ij}$ and $X_{ji}$), and let $F_{ij}$ denote the
distribution function of $X_{ij}$. The $X_{ij}$'s are assumed independent.

Let $\mathcal{T}$ denote the class of all spanning trees of $G$. By Cayley's
formula (see Riordan [1978]), if $G$ is complete (i.e., if all links are
permitted), then

    $|\mathcal{T}| = n^{n-2}$.

In general, let $|\mathcal{T}| = M \le n^{n-2}$. Then
$\mathcal{T} = \{T_1, \ldots, T_M\}$, where each $T_i$ in $\mathcal{T}$ is a
spanning tree of $G$. We shall assume that $G$ is connected.

Let

    $\xi = \min_{T_i \in \mathcal{T}} \{Y_i\}$    (1)

where

    $Y_i = \sum_{(j,k) \in T_i} X_{jk}$    (2)

is the length of the $i$th spanning tree. It is clear that $Y_i$ and $\xi$ are
well defined random variables. The characterization of the random variable $\xi$
is, in general, quite difficult. Our attention will be focused on obtaining
approximations to $\xi$ and its distribution function $F_\xi$.

The obvious approximation to the expected length of the MST is obtained by
replacing the random variables by their expectations and then constructing the
MST. Let

    $g = \min_{T_i \in \mathcal{T}} \{EY_i\} = \min_{T_i \in \mathcal{T}} \{ \sum_{(j,k) \in T_i} EX_{jk} \}$    (3)

represent this approximation. It may be trivially established that $g$ does
provide an upper bound, i.e.,

    $E\xi \le g$.    (4)
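
The step behind (4) can be spelled out in two lines. The derivation below is a worked sketch added for clarity, using only the definitions of $\xi$, $Y_i$ and $g$ given above; it is not taken verbatim from the original text.

```latex
% For every fixed spanning tree T_k the minimum is no larger than Y_k:
\xi \;=\; \min_{T_i \in \mathcal{T}} Y_i \;\le\; Y_k
\quad\Longrightarrow\quad
E\xi \;\le\; E Y_k \qquad \text{for every } k = 1, \ldots, M .
% The left-hand side does not depend on k, so minimizing the right-hand side gives (4):
E\xi \;\le\; \min_{T_k \in \mathcal{T}} E Y_k \;=\; g .
```

Note that the argument uses only monotonicity of expectation, so it does not require independence of the arc lengths.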

The value of $g$ can be easily computed by any of the several "greedy" algorithms
(Kruskal [1956]; Prim [1957]; Loberman and Weinberg [1957]).

We shall use the following version of the Prim algorithm (Aho et al. [1983],
p. 235) to construct an analytic device which we shall use to obtain a tighter
bound for $E\xi$. The algorithm is described in pseudo-Pascal notation.
procedure PRIM (G(N, A): graph; var S: set of arcs;
                var q: length; w: arc weights);
{PRIM constructs an MST for G; the nodes in N are numbered from 1 to n}
var
    U : set of nodes;
    i, j : vertex;
    k : integer;
    PRIMNUM : array[1 .. n] of integer;
begin
    q := 0;
    S := ∅;
    U := {1};
    PRIMNUM(1) := 1;
    k := 2;
    while U ≠ N do
    begin
        find (i, j) such that w_ij = min {w_uk : u ∈ U, k ∈ N − U};
        PRIMNUM(k) := j;
        S := S ∪ {(i, j)};
        U := U ∪ {j};
        q := q + w_ij;
        k := k + 1
    end
end; {PRIM}


The algorithm PRIM constructs an MST with respect to the set of arc weights
$\{w_{ij}\}_{(i,j) \in A}$ and stores the MST arcs in the set $S$. In variable
$q$ it computes the "length" of the MST so obtained. It also records in the
array PRIMNUM the node numbers of the nodes of $G$ in the order they were
selected for inclusion in the set $U$.
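
For readers who want to experiment, the listing can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: nodes are 0-based rather than 1-based, the graph is given as a symmetric matrix `weights` with `math.inf` marking a missing arc, and the returned list `order` plays the role of PRIMNUM. The function name `prim` and the sample matrix `w` are hypothetical.

```python
import math

def prim(weights, start=0):
    """Return (mst_length, arcs, order) for a symmetric weight matrix.

    weights[i][j] is the length of arc (i, j); math.inf marks a missing arc.
    order lists the nodes in the sequence in which they joined the tree,
    i.e. the role played by PRIMNUM in the pseudo-Pascal listing.
    """
    n = len(weights)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known arc from the tree to each node
    parent = [None] * n
    best[start] = 0.0
    length, arcs, order = 0.0, [], []
    for _ in range(n):
        # pick the node outside the tree that is cheapest to attach
        j = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        if math.isinf(best[j]):
            raise ValueError("graph is not connected")
        in_tree[j] = True
        order.append(j)
        if parent[j] is not None:
            arcs.append((parent[j], j))
            length += best[j]
        # update the cheapest attachment cost of the remaining nodes
        for v in range(n):
            if not in_tree[v] and weights[j][v] < best[v]:
                best[v] = weights[j][v]
                parent[v] = j
    return length, arcs, order

# Hypothetical 4-node example; INF marks arcs that are absent from A.
INF = math.inf
w = [[INF, 1, 4, 3],
     [1, INF, 2, INF],
     [4, 2, INF, 5],
     [3, INF, 5, INF]]
length, arcs, order = prim(w)
```

Relabelling the nodes in the sequence given by `order` yields a numbering in which every node is attached to its nearest lower-numbered node, which is the property the construction below relies on.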

We shall make use of PRIM to generate a new numbering of the nodes of $G$
w.r.t. a given set $\{w_{ij}\}_{(i,j) \in A}$, according to the order in which
they were included in the minimal spanning tree. Let this new numbering of nodes
be called the Prim-numbering of $G$ w.r.t. $\{w_{ij}\}_{(i,j) \in A}$. It then
follows from this definition that if $\{1', 2', \ldots, n'\}$ represents a
Prim-numbering of $G$ with respect to arc weights $EX_{ij}$, then

    $g = \sum_{i'=2'}^{n'} \min_{j' < i'} \{EX_{i'j'}\}$.    (5)

We shall henceforth assume that $G$ is Prim-numbered w.r.t.
$\{EX_{ij}\}_{(i,j) \in A}$.

The following analytic device, motivated by Gilbert's paper [1965], enables us to
obtain a better bound for $E\xi$. Consider a spanning tree (not necessarily
minimal) constructed for a given $\omega$ (i.e., for a given realization of the
graph), as follows: For $i = 2, 3, \ldots, n$ choose the arc $(i, j)$ such that

    $X_{ij}(\omega) = \min_{k < i} \{X_{ik}(\omega)\}$.

It is easily shown by induction on $i$ that for each $\omega$, the $n - 1$ arcs
chosen as above form a spanning tree (Gilbert 1965). We shall refer to this tree
as an exodic tree (the term is coined by Gilbert, who explains its interpretation
on p. 378 of his paper). Define $W_i = \min\{X_{i1}, \ldots, X_{i,i-1}\}$ for
$i = 2, \ldots, n$, and

    $Z = \sum_{i=2}^{n} W_i = \sum_{i=2}^{n} \min_{j < i} \{X_{ij}\}$.    (6)

Then $Z$ is a random variable that represents the length of the exodic tree. $Z$
is uniquely defined for a given numbering of the nodes of $G$ and is simpler to
analyze than $\xi$. Indeed it is easy to show that $EZ$ improves on $g$ as an
approximation of $E\xi$.
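
The exodic-tree construction is short enough to state in code. The sketch below is an illustration only: it assumes 0-based node indices (so node 0 is the root and the sum runs over nodes 1 to n-1), a symmetric matrix `x` holding one realization of the arc lengths, and a plain O(n^2) Prim routine for comparison. The names `exodic_length`, `mst_length` and the sample matrix are hypothetical.

```python
import math

def exodic_length(x):
    """Length Z of the exodic tree: every node i >= 1 is joined to its nearest
    lower-numbered node, so Z = sum_i min_{j < i} x[i][j]."""
    return sum(min(x[i][j] for j in range(i)) for i in range(1, len(x)))

def mst_length(x):
    """MST length by the O(n^2) Prim method, used here only for comparison."""
    n = len(x)
    best = [math.inf] * n      # cheapest arc from the current tree to each node
    best[0] = 0.0
    done = [False] * n
    total = 0.0
    for _ in range(n):
        j = min((v for v in range(n) if not done[v]), key=lambda v: best[v])
        done[j] = True
        total += best[j]
        for v in range(n):
            if not done[v] and x[j][v] < best[v]:
                best[v] = x[j][v]
    return total

# One hypothetical realization of the arc lengths on a complete 4-node graph.
x = [[0, 5, 1, 4],
     [5, 0, 2, 6],
     [1, 2, 0, 3],
     [4, 6, 3, 0]]
z = exodic_length(x)     # 5 + 1 + 3
xi = mst_length(x)       # 1 + 2 + 3
```

In this realization the exodic tree is strictly longer than the MST because node 1 is forced to connect to node 0, its only lower-numbered neighbour, even though node 2 is closer.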

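Theorem 1. If $G$ is Prim-numbered w.r.t. $\{EX_{ij}\}_{(i,j) \in A}$, then

    $E\xi \le EZ \le g$.

Proof: The first inequality is trivial because by construction, $\xi \le Z$
almost surely. Now, from (6)

    $EZ = E \sum_{i=2}^{n} \min_{j < i} \{X_{ij}\}$
       $= \sum_{i=2}^{n} E \min_{j < i} \{X_{ij}\}$
       $\le \sum_{i=2}^{n} \min_{j < i} \{EX_{ij}\}$
       $= g$

where the last equation follows from (5).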


It  is  noteworthy  that  our  choice  of  the  first  node  in  the  PRIM  algorithm  is  quite 
arbitrary.  There  are,  therefore,  at  least  n  different  Prim-numberings  of  G  which 
satisfy  Theorem  1,  and  each  yields  a  (possibly)  different  upper  bound  Z.  (There 
would  be  more  than  n  Prim-numberings  if  ties  are  encountered  during  execution  of 
PRIM). 

Theorem 1 holds even when there is lack of independence among the $X_{ij}$'s. To
obtain the distribution of $Z$, however, we invoke this independence. Let the
random variables $W_i$ have distribution functions $F_{W_i}$ for
$i = 2, \ldots, n$. Since $Z = \sum_{i=2}^{n} W_i$, the DF of $Z$ is given by
the $(n - 1)$-fold convolution

    $F_Z(x) = (F_{W_2} * \cdots * F_{W_n})(x)$

where

    $F_{W_i}(x) = 1 - \prod_{k=1}^{i-1} (1 - F_{ik}(x))$.

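To illustrate the use of these formulas, consider the exponential case

    $F_{ij}(x) = 1 - \exp(-\lambda_{ij} x)$,    $x \ge 0$

and let the $X_{ij}$ be independent. Then,

    $F_{W_i}(x) = 1 - \prod_{k=1}^{i-1} \exp(-\lambda_{ik} x) = 1 - \exp(-(\sum_{k=1}^{i-1} \lambda_{ik}) x)$.

Let $\lambda_i = \sum_{k=1}^{i-1} \lambda_{ik}$. Then the Laplace-Stieltjes
transform of $F_{W_i}$ is given by

    $\tilde{F}_{W_i}(s) = \int_0^\infty e^{-sx} \, dF_{W_i}(x) = \frac{\lambda_i}{s + \lambda_i}$

and the Laplace-Stieltjes transform of $F_Z$ is

    $\tilde{F}_Z(s) = \prod_{i=2}^{n} \frac{\lambda_i}{s + \lambda_i}$.

To invert the transform, we expand $\tilde{F}_Z(s)$ as a partial fraction (this
requires the $\lambda_i$ to be distinct):

    $\tilde{F}_Z(s) = \frac{A_2}{s + \lambda_2} + \frac{A_3}{s + \lambda_3} + \cdots + \frac{A_n}{s + \lambda_n}$.    (9)

By successively substituting $-\lambda_2$, $-\lambda_3$, etc. for $s$ and solving
for $A_2$, $A_3$, etc., we obtain

    $A_l = \frac{\prod_{i=2}^{n} \lambda_i}{\prod_{i \ne l} (\lambda_i - \lambda_l)}$,    $l = 2, \ldots, n$.

And now by the linearity of the Laplace transform, the individual terms in (9)
can be inverted to yield the distribution of $Z$:

    $F_Z(x) = \sum_{l=2}^{n} \int_0^x A_l e^{-\lambda_l t} \, dt = \sum_{l=2}^{n} \frac{A_l}{\lambda_l} (1 - e^{-\lambda_l x})$

which gives

    $EZ = \sum_{l=2}^{n} \frac{A_l}{\lambda_l^2}$.    (11)

When the $X_{ij}$ are i.i.d. exponential with parameter $\lambda$ on a complete
graph, $\lambda_i = (i - 1)\lambda$ and (11) yields

    $EZ = \frac{1}{\lambda} \sum_{k=1}^{n-1} \frac{1}{k}$.    (12)

If the Laplace transform of $F_Z$ is not directly invertible by inspection as
above, then an explicit integration using the Inversion Formula may have to be
used.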
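
The exponential-case formulas are easy to check numerically. The sketch below is an illustration under stated assumptions: `rates` is a hypothetical list holding the aggregate rates $\lambda_2, \ldots, \lambda_n$ of the $W_i$, which must be distinct for the partial-fraction form; the direct sum of $1/\lambda_i$ (the mean of a sum of independent exponentials) is included as a cross-check. The function names are not from the paper.

```python
import math

def ez_direct(rates):
    """EZ = sum_i 1/lambda_i, since W_i is exponential with rate lambda_i."""
    return sum(1.0 / lam for lam in rates)

def ez_partial_fractions(rates):
    """EZ from (11): sum_l A_l / lambda_l**2, valid only for distinct rates."""
    prod_all = math.prod(rates)
    total = 0.0
    for l, lam_l in enumerate(rates):
        # denominator of A_l: product over i != l of (lambda_i - lambda_l)
        denom = math.prod(lam - lam_l for i, lam in enumerate(rates) if i != l)
        total += (prod_all / denom) / lam_l ** 2
    return total

def ez_iid(n, lam=1.0):
    """i.i.d. case on a complete graph, where lambda_i = (i - 1) * lam:
    EZ is the (n-1)th harmonic number divided by lam."""
    return sum(1.0 / k for k in range(1, n)) / lam

rates = [1.0, 2.0, 4.0]                 # hypothetical, distinct
a = ez_direct(rates)                    # 1 + 1/2 + 1/4
b = ez_partial_fractions(rates)         # same value via the coefficients A_l
```

The harmonic-sum form reproduces the $EZ$ column of Table 4 (for instance, `ez_iid(10)` rounds to 2.829). The partial-fraction coefficients alternate in sign and grow quickly, so for large $n$ the direct sum is the numerically safer route to $EZ$.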


2  Accuracy  of  the  Approximation 

Some insight into the relationship between $\xi$ and $Z$ can be obtained from a
Monte-Carlo simulation of the random minimal spanning tree. For our experiment,
we generated graphs with random arc lengths for complete and sparse graphs with
i.i.d. and non-i.i.d. arc lengths. Tables 1 through 4 compare $E\xi$, $EZ$, and
$g$ for graphs of various sizes. $EZ$ is computed for each value of $n$ by using
the formulas of Section 1. $E\xi$ is the average over 1000 Monte-Carlo
realizations of the random graph whose structure (i.e., the sets $N$ and $A$,
and the $\mu_{ij}$'s, the mean arc lengths for arcs in $A$) was chosen prior to
the Monte-Carlo step, in accordance with the schemes indicated in the four
tables. For sparse graphs (Tables 1 and 3), our code first checked for
connectedness of graphs (not guaranteed by our method of generating random
structures), though for a sparsity factor (i.e., probability that an arc exists)
of 0.1, we always got connected graphs.
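
A small-scale version of such an experiment can be sketched as follows. This is not the authors' simulation code: it is a minimal illustration for the complete-graph, i.i.d. exponential scheme of Table 4 only, with a hypothetical function `simulate`, a fixed seed, and 200 rather than 1000 realizations to keep the run short. It also averages $Z$ over the same realizations, whereas the paper computes $EZ$ analytically.

```python
import math
import random

def mst_length(x):
    """MST length of a complete graph by the O(n^2) Prim method."""
    n = len(x)
    best = [math.inf] * n
    best[0] = 0.0
    done = [False] * n
    total = 0.0
    for _ in range(n):
        j = min((v for v in range(n) if not done[v]), key=lambda v: best[v])
        done[j] = True
        total += best[j]
        for v in range(n):
            if not done[v] and x[j][v] < best[v]:
                best[v] = x[j][v]
    return total

def exodic_length(x):
    """Exodic-tree length: each node joins its nearest lower-numbered node."""
    return sum(min(x[i][j] for j in range(i)) for i in range(1, len(x)))

def simulate(n, reps=200, mean=1.0, seed=0):
    """Average MST and exodic-tree lengths over `reps` realizations of a
    complete graph with i.i.d. exponential arc lengths of the given mean."""
    rng = random.Random(seed)
    sum_mst = sum_z = 0.0
    for _ in range(reps):
        x = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i):
                # expovariate takes the rate, i.e. 1 / mean
                x[i][j] = x[j][i] = rng.expovariate(1.0 / mean)
        sum_mst += mst_length(x)
        sum_z += exodic_length(x)
    return sum_mst / reps, sum_z / reps

mst_avg, z_avg = simulate(10)
g = (10 - 1) * 1.0    # deterministic bound: n - 1 arcs of mean length 1
```

Because $\xi \le Z$ holds for every realization, the simulated averages satisfy the same ordering exactly, not just up to sampling error.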
Table 1: $X_{ij}$ exponentially distributed with $\mu_{ij}$ chosen uniformly
between 1.0 and 100.0. Sparsity factor = 0.1 (i.e., Pr{arc exists} = 0.1).

      n      $E\xi$       $EZ$        $g$
     40     626.046     791.389    1120.296
     50     622.346     865.687    1228.896
     60     381.298     566.956     962.151
     70     422.141     601.960    1107.655
     80     380.470     541.991    1111.642
     90     354.753     532.965    1113.422
    100     348.155     548.266    1143.779
    110     348.654     552.688    1231.698
    500     274.760     469.036    1590.662

Table 2: $X_{ij}$ exponentially distributed with $\mu_{ij}$ chosen uniformly
between 1.0 and 100.0. Sparsity factor = 1.0 (i.e., complete graph).

      n      $E\xi$       $EZ$        $g$
     10      34.276      46.627     105.640
     20      31.642      49.353     137.619
     30      28.040      43.484     135.129
     40      27.191      42.128     136.703
     50      26.367      47.483     157.473
     60      27.955      51.071     173.729
     70      27.157      50.821     185.730
     80      26.064      51.209     187.194
     90      27.156      50.809     203.407
    100      27.432      50.940     222.503
    110      26.963      54.678     230.275
    500      26.140      74.147     518.133
Table 3: $X_{ij}$ i.i.d. exponential with $\mu_{ij} = 1.0$. Sparsity factor = 0.1
(i.e., Pr{arc $(i,j)$ exists} = 0.1).

      n      $E\xi$       $EZ$        $g$
     40      16.562      27.261      39.0
     50      16.172      34.778      49.0
     60      13.466      32.552      59.0
     70      13.646      34.047      69.0
     80      13.453      34.834      79.0
     90      12.476      36.364      89.0
    100      12.196      37.180      99.0
    110      11.908      38.747     109.0
    500      12.346      55.674     499.0

Table 4: $X_{ij}$ i.i.d. exponential with $\mu_{ij} = 1.0$. Sparsity factor = 1.0
(i.e., complete graph).

      n      $E\xi$       $EZ$        $g$
     10       1.258       2.829       9.0
     20       1.243       3.548      19.0
     30       1.240       3.962      29.0
     40       1.242       4.254      39.0
     50       1.222       4.479      49.0
     60       1.225       4.663      59.0
     70       1.241       4.819      69.0
     80       1.216       4.953      79.0
     90       1.222       5.071      89.0
    100       1.216       5.177      99.0
    110       1.228       5.273     109.0
    500       1.248       6.791     499.0

It is apparent in the cases described in Tables 1-4 that the exodic tree
performs best in relatively sparse graphs with non-identical arcs. It also
appears to be relatively better for small graphs. In each case, $EZ$ provides a
considerable improvement over $g$ as an estimate of $E\xi$. Table 4 also exhibits
convergence of $E\xi$ to the Frieze [1985] limit $\mu \cdot \zeta(3)$.

In the remainder of this section we develop an analytical characterization of the
tightness of the exodic tree as an approximation to the random minimal spanning
tree. We shall first establish the following lower bound.


Proposition 2. Let $\Lambda_i^G = \min_{j \ne i} \{X_{ij}\}$ for
$i = 2, \ldots, n$ and $\Lambda = \sum_{i=2}^{n} \Lambda_i^G$. Then for every
realization of the graph $G(N, A)$,

    $\xi \ge \Lambda$.

Proof: We use induction and the following obvious facts about deterministic MST's:

1. Every MST (indeed, every tree) has at least two nodes of degree 1.

2. If a single-degree node and its (only) associated arc are deleted from an MST,
the remaining arcs form an MST on the remaining nodes.

The Proposition trivially holds for $n = 2$. Assume it is true for every graph
$G(N, A)$ where $|N| = n$. Let $G(N', A')$ be another graph with $|N'| = n + 1$.
For a given $\omega$ let $i$ be one of the single-degree nodes on the MST for this
realization of $G(N', A')$. Let $j \in N'$ be the only node adjacent to $i$ on
this MST. Then

    $\xi_{G(N',A')}(\omega) = X_{ij}(\omega) + \xi_{G(N' \setminus \{i\}, A'_i)}(\omega)$

where subscripts on $\xi$ indicate the graph w.r.t. which the MST length is being
computed, and $A'_i = A' \setminus \{(i,k) \mid (i,k) \in A'\}$. Since
$G(N' \setminus \{i\}, A'_i)$ is an $n$-node graph, we have by the induction
hypothesis,

    $\xi_{G(N',A')}(\omega) \ge X_{ij}(\omega) + \sum_{l=2, l \ne i}^{n+1} \min_{k \ne l, k \ne i} \{X_{lk}(\omega)\}$

    $\ge \min_{k \ne i} X_{ik}(\omega) + \sum_{l=2, l \ne i}^{n+1} \min_{k \ne l} \{X_{lk}(\omega)\}$

    $= \Lambda_i^{G'}(\omega) + \sum_{l=2, l \ne i}^{n+1} \Lambda_l^{G'}(\omega)$

    $= \sum_{l=2}^{n+1} \Lambda_l^{G'}(\omega) = \Lambda_{G(N',A')}(\omega)$.  ∎

The random variable $\Lambda$ provides a useful lower bound for $\xi$. Also,
$Z - \Lambda$ provides an easily computed upper bound on the difference between
$Z$ and $\xi$. For instance, it is readily seen that for the graphs summarized in
Tables 3 and 4, the value of $E\Lambda$ is 10 and 1 respectively. This eliminates
the possibility that $E\xi$ approaches zero as the number of nodes in the graph
grows without limit.

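For notational convenience, we denote $\max\{x, 0\}$ as $x^+$ for real $x$, and
define

    $V_i = \min\{X_{i,i+1}, \ldots, X_{in}\}$    if $2 \le i \le n - 1$,
    $V_i = \infty$                              if $i = n$;

then, recalling that $W_i = \min\{X_{i1}, \ldots, X_{i,i-1}\}$,

    $\Lambda = \sum_{i=2}^{n} \Lambda_i = \sum_{i=2}^{n} \min\{W_i, V_i\}$
            $= \sum_{i=2}^{n} [W_i - (W_i - V_i)^+]$
            $= Z - \sum_{i=2}^{n} (W_i - V_i)^+$.

Hence

    $Z - \Lambda = \sum_{i=2}^{n} (W_i - V_i)^+$.    (13)

It is then clear from (13) that the necessary and sufficient condition for

    $E\Lambda = E\xi = EZ$

is

    $V_i \ge W_i$ w.p. 1    $\forall i = 2, \ldots, n$.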

Thus, a node numbering $\pi$ of $G$ is "optimal" (i.e., the upper and lower
bounds on $E\xi$ with respect to $\pi$ coalesce into $E\xi$) if the above
dominance condition is satisfied for $\pi$. This dominance condition, however, is
not only difficult to verify, it may well fail to hold for every one of the $n!$
numberings of $G$ (as is evident for any graph whose arc lengths are distributed
over the same range). One simple situation in which the dominance condition does
hold is when one node is closer to all other nodes than they are to each other.
Suppose, for example, that for any $i, j$, $P\{X_{1i} \le X_{ij}\} = 1$. In this
case the minimal spanning tree consists of the familiar "hub and spoke" pattern,
i.e., the arcs $(1, j)$, $j = 2, \ldots, n$. Clearly, however, the coalescence of
$\Lambda$, $\xi$ and $Z$ involves very special circumstances.
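
The relation between the lower bound, the exodic tree and the gap in (13) can be checked on a single realization. The sketch below is an illustration only, with 0-based node indices (node 0 plays the role of node 1 in the text) and two hypothetical matrices: a generic realization `x`, and a hub-and-spoke realization `hub` in which node 0 is at distance 1 from every node and all other arcs have length 10.

```python
import math

def bounds(x):
    """Return (Lam, Z, gap) for one realization x of the arc lengths, where
    W_i = min_{j<i} x[i][j], V_i = min_{j>i} x[i][j] (infinity for the last
    node), Lam = sum_i min(W_i, V_i), Z = sum_i W_i and
    gap = sum_i (W_i - V_i)^+, all sums running over i = 1, ..., n-1."""
    n = len(x)
    lam = z = gap = 0.0
    for i in range(1, n):
        w = min(x[i][j] for j in range(i))
        v = min((x[i][j] for j in range(i + 1, n)), default=math.inf)
        lam += min(w, v)
        z += w
        gap += max(w - v, 0.0)
    return lam, z, gap

# Generic realization: node 1 has a closer higher-numbered neighbour (node 2).
x = [[0, 5, 1, 4],
     [5, 0, 2, 6],
     [1, 2, 0, 3],
     [4, 6, 3, 0]]

# Hub-and-spoke realization: the dominance condition V_i >= W_i holds.
hub = [[0, 1, 1, 1],
       [1, 0, 10, 10],
       [1, 10, 0, 10],
       [1, 10, 10, 0]]

lam_x, z_x, gap_x = bounds(x)
lam_h, z_h, gap_h = bounds(hub)
```

For `x` the gap is 3, all of it contributed by node 1; for `hub` the gap vanishes, so the lower bound, the MST length and the exodic-tree length coincide.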


3  Asymptotic  Optimality 

Two questions arise naturally from our discussion of approximations for the
random MST. First, how well do the bounds perform for large graphs? Second, what
are the asymptotic properties of the random MST? Our next result shows that under
appropriate conditions the relative error induced by using the exodic tree in
place of the MST tends to zero as the number of nodes in $G$ grows large.

In order to characterize the growth of graphs, we use the "incremental" model of
Weide [1978]. Consider a sequence of graphs $G_1, G_2, \ldots$, where
$G_n = (N_n, A_n)$, $N_n$ and $A_n$ are respectively the node and arc sets of
$G_n$, and $|N_n| = n$. Let $X_{ij}^n$ denote the arc length random variable for
$i, j \in N_n$ and $(i, j) \in A_n$. The incremental growth model assumes
$G_1 \subset G_2 \subset \cdots$ in the sense that
$X_{ij}^n = X_{ij}^{n+1} = \cdots$ for each $(i, j) \in A_n$. Thus, the graph
$G_{n+1}$ is identical to the graph $G_n$ save for the addition of node $n + 1$
and arcs incident to it.

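To denote the functional dependence on the number of nodes, we shall henceforth
annotate our symbols with $n$. Thus

    $\xi_n = \min_{T_i \in \mathcal{T}_n} \{ \sum_{(j,k) \in T_i} X_{jk}^n \}$

and

    $Z_n = \sum_{i=2}^{n} W_i^n = \sum_{i=2}^{n} \min_{j < i} \{X_{ij}^n\}$.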


Theorem 3. Let $\{G_n\}$ be a sequence of complete graphs growing according to
the incremental model. Let $X_{ij}^n$ be independent non-negative random
variables with distribution functions $F_{ij}$ having a positive lower support
$a$ for all $(i, j) \in A_n$, i.e., $\exists a > 0$ such that $F_{ij}(a) = 0$
$\forall (i, j) \in A_n$, and $\exists$ a distribution function $F_*$ such that

1. $F_*(x) \le F_{ij}(x)$ $\forall x$ and $\forall (i, j) \in A_n$, and

2. $F_*(a + \epsilon) > 0$ $\forall \epsilon > 0$.

Then as $n \to \infty$,

    (a) $Z_n / \xi_n \to 1$ a.s.

If, in addition, $E(\sup_{n \ge 2} \min_{j < n} \{X_{nj}^n\}) < \infty$, then

    (b) $E(Z_n / \xi_n) \to 1$, and

    (c) $EZ_n / E\xi_n \to 1$.
Proof: Since $F_{ij}(a) = 0$ $\forall (i, j) \in A_n$, we have

    $X_{ij}^n \ge a$ w.p. 1    $\forall (i, j) \in A_n$.

Hence $\xi_n \ge (n - 1)a$ w.p. 1, and

    $1 \le \frac{Z_n}{\xi_n} \le \frac{Z_n}{(n - 1)a}$    w.p. 1.    (14)

Now, $\forall \epsilon > 0$,

    $P\{W_i^i > a + \epsilon\} = \prod_{j=1}^{i-1} (1 - F_{ij}(a + \epsilon)) \le (1 - F_*(a + \epsilon))^{i-1}$.

Hence,

    $\sum_{i=2}^{\infty} P\{W_i^i > a + \epsilon\} \le \sum_{i=2}^{\infty} (1 - F_*(a + \epsilon))^{i-1} < \infty$    (15)

since $F_*(a + \epsilon) > 0$; and by the Borel-Cantelli Lemma,

    $P\{W_i^i > a + \epsilon \text{ i.o.}\} = 0$, i.e., $W_i^i \to a$ a.s.

Therefore the Cesàro sum also converges, i.e., as $n \to \infty$

    $\frac{1}{n - 1} \sum_{i=2}^{n} W_i^i \to a$    a.s.

Now, the incremental growth of $G_n$ ensures that for any $n \ge i$,
$W_i^n = W_i^i$, so we have from (14)

    $\frac{Z_n}{\xi_n} \to 1$    a.s.

To prove (b), observe that

    $E[\sup_{n \ge 2} \frac{Z_n}{\xi_n}] \le E[\sup_{n \ge 2} \frac{Z_n}{(n - 1)a}]$
        $\le E[\sup_{n \ge 2} \frac{(n - 1) \sup_{i \ge 2} W_i^i}{(n - 1)a}]$
        $= \frac{1}{a} E[\sup_{i \ge 2} \min_{j < i} \{X_{ij}^i\}]$
        $< \infty$.

Hence by the Lebesgue Dominated Convergence Theorem, part (a) implies

    $E(Z_n / \xi_n) \to 1$.

Similarly, since $E(\sup_{n \ge 2} W_n^n) = E(\sup_{n \ge 2} \min_{j < n} \{X_{nj}^n\}) < \infty$,
we again have by the Lebesgue DCT that

    $EW_n^n \to a$.

Then (c) follows by the Cesàro argument since

    $1 \le \frac{EZ_n}{E\xi_n} \le \frac{EW_2^2 + EW_3^3 + \cdots + EW_n^n}{(n - 1)a}$.

The convergence results of Theorem 3 can be established for sparse graphs as
well, if we ensure that $W_n^n \to a$ a.s. The following result extends Theorem 3
to incomplete graphs. Define the in-degree of a node $i$ in $G_n$ to be the
number of lower-numbered nodes connected to $i$, i.e., let
$d^n(i) = |\{j < i : (i, j) \in A_n\}|$, $2 \le i \le n$.

Again, the incremental growth of $G_n$ ensures that $d^n(i) = d^i(i)$
$\forall n \ge i$.

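Theorem 3'. Theorem 3 holds for incomplete graphs if there exist positive
constants $\delta$, $K_1$, $K_2$, and $N$ such that $n \ge N$ implies

    $d^n(n) \ge K_1 + K_2 [\log n + (1 + \delta) \log \log n]$.

Proof: Let $\beta_\epsilon = (1 - F_*(a + \epsilon))$. Then the inequality in
(15) reduces to

    $\sum_{n=2}^{\infty} P\{W_n^n > a + \epsilon\} \le \sum_{n=2}^{\infty} \beta_\epsilon^{d^n(n)}$.

Since $\sum_n \frac{1}{n (\log n)^{1+\delta}} < \infty$ for all $\delta > 0$, we
have by the comparison test that $\sum_n \beta_\epsilon^{d^n(n)}$ converges if
there exist positive constants $C_1$, $N$ such that $n \ge N$ implies

    $\beta_\epsilon^{d^n(n)} \le \frac{C_1}{n (\log n)^{1+\delta}}$.

Writing $\log \beta_\epsilon = -C_2$, with $C_2 > 0$, the condition for
convergence is given as

    $d^n(n) \ge -\frac{\log C_1}{C_2} + \frac{1}{C_2} [\log n + (1 + \delta) \log \log n]$.

Writing $K_1 = (\log C_1 / {-C_2})$ and $K_2 = (1 / C_2)$ yields the desired
condition.  ∎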


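It is of interest to note that under the growth conditions of Theorem 3,

    $\liminf_{n \to \infty} \frac{g_n}{E\xi_n} \ge \frac{\inf_{(i,j)} EX_{ij}}{a}$.

Thus, for instance, when the $X_{ij}$ are i.i.d. with a shifted exponential
distribution, i.e.,

    $F(x) = 0$                            if $x < a$,
    $F(x) = 1 - e^{-\lambda (x - a)}$     if $x \ge a$,

then $g_n / E\xi_n$ remains bounded below by $1 + \frac{1}{\lambda a}$, whereas

    $1 \le \frac{EZ_n}{E\xi_n} \le \frac{EZ_n}{E\Lambda_n} \le \frac{(n - 1)a + \frac{1}{\lambda}(\log(n - 1) + 1)}{(n - 1)a} = 1 + \frac{\log(n - 1) + 1}{\lambda (n - 1) a}$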
which indicates rapid convergence of $EZ_n / E\xi_n$ to 1. For example if $a = 1$, $\lambda$ = ^,
and $n = 500$, then $EZ_n$ overestimates $E\xi_n$ by at most 15%.
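
The bound on the relative overestimate is simple to evaluate. The sketch below is an illustration only: the function name and the parameter values ($a = 1$ with $\lambda = 1$ and $\lambda = 0.1$) are chosen here for demonstration and are not claimed to be the values used in the original example. It also compares the logarithmic bound with the exact harmonic sum it replaces.

```python
import math

def relative_error_bound(n, a, lam):
    """Upper bound on EZ_n / E(xi_n) - 1 for i.i.d. shifted-exponential arcs:
    (log(n - 1) + 1) / (lam * (n - 1) * a)."""
    return (math.log(n - 1) + 1.0) / (lam * (n - 1) * a)

def harmonic_error_bound(n, a, lam):
    """Same bound with the exact harmonic sum H_{n-1} in place of
    log(n - 1) + 1; it is never larger than relative_error_bound."""
    h = sum(1.0 / k for k in range(1, n))
    return h / (lam * (n - 1) * a)

# Illustrative values only (assumptions, not taken from the source).
for lam in (1.0, 0.1):
    for n in (50, 500, 5000):
        print(n, lam, round(relative_error_bound(n, 1.0, lam), 4))
```

The bound decays like $\log n / n$, and it scales inversely with $\lambda a$: the larger the guaranteed minimum arc length relative to the mean excess $1/\lambda$, the tighter the exodic-tree approximation.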


4  Application  to  a  Network  Provisioning  Model 

Consider the task of constructing a communications network with the MST topology
to interlink a given configuration of nodes. The cost of the connecting cable
(e.g., a coaxial fiber optic cable) and its installation charge per unit length
are deemed substantially high; and at the planning stage the length of cable
needed to link a pair of nodes is not known with certainty (due to uncertainties
about the exact path to be taken, wastages, etc.). The planner needs to place an
order for the total length of cable required for the network based on
probabilistic information about the inter-node distances, and the "true"
distances (upon which the MST configuration is based) become known only later at
the implementation stage. If the ordered length of cable falls short of the true
requirement, then a supplemental order must be placed at a higher unit cost,
possibly also accompanied by a fixed ordering cost. Contrariwise, the surplus
length of cable can be disposed of at some salvage value. The decision problem is
to determine the optimal quantity of cable to be ordered at the first stage so as
to minimize the total expected cost of cable needed for the network. This problem
is an example of the classical single-period "newsboy" problem. The difficulty
lies in characterizing the "demand" distribution.
Let $x$ be this order quantity and let $c_1$, $c_2$ be the unit costs of cable in
stage 1 and stage 2 respectively. Let $s$ be the salvage value per unit length of
cable, and $K$ the fixed ordering cost should a supplemental order be needed in
stage 2. Let $0 \le s < c_1 < c_2$, and $K \ge 0$.

As before, $G = (N, A)$ represents the graph with $|N| = n$, and the arc lengths
$X_{ij}$'s are independent random variables with DF's $F_{ij}$'s. Then the total
cost of cable needed for the network is given by

    $C(\xi_n, x) = c_1 x + c_2 (\xi_n - x)^+ + K 1_{\{\xi_n > x\}} - s (x - \xi_n)^+$    (16)

whence

    $EC(\xi_n, x) = c_1 x + c_2 \int_x^\infty (t - x) \, dF_{\xi_n}(t) + K (1 - F_{\xi_n}(x)) - s \int_0^x (x - t) \, dF_{\xi_n}(t)$    (17)

where $F_{\xi_n}$ is the DF of the MST length. If $F_{\xi_n}$ has density
$f_{\xi_n}$, then the first-order condition requires that the optimal order
satisfy the equation

    $(c_2 - s) F_{\xi_n}(x) - K f_{\xi_n}(x) = c_2 - c_1$.    (18)

In case there is no fixed cost for the supplemental order, the optimal order
quantity $x^*_{\xi_n}$ is the $\left(\frac{c_2 - c_1}{c_2 - s}\right)$th fractile
of $F_{\xi_n}$.

Since $F_{\xi_n}$ is analytically intractable, we solve a surrogate problem of
minimizing $EC(Z_n, x)$, the expected total cost for the exodic tree, and obtain
the optimal order quantity $x^*_{Z_n}$ by solving

    $(c_2 - s) F_{Z_n}(x) - K f_{Z_n}(x) = c_2 - c_1$.

Again, if $K = 0$, $x^*_{Z_n}$ is obtained as the
$\left(\frac{c_2 - c_1}{c_2 - s}\right)$th fractile of $F_{Z_n}$, which can be
computed as in Section 1. Notice that since $\xi_n \le Z_n$ w.p. 1, we have
$F_{\xi_n}(x) \ge F_{Z_n}(x)$ $\forall x$, and hence
$x^*_{\xi_n} \le x^*_{Z_n}$.
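
For the case $K = 0$ with exponential arc lengths, the surrogate order quantity can be computed directly from the partial-fraction form of $F_Z$ in Section 1. The sketch below is an illustration under stated assumptions: `rates` is a hypothetical list of the aggregate rates $\lambda_2, \ldots, \lambda_n$, which must be distinct; the fractile equation is solved by plain bisection; and the function names and cost values are not from the paper.

```python
import math

def exodic_cdf(x, rates):
    """F_Z(x) for independent exponential W_i with distinct rates, using
    F_Z(x) = sum_l (A_l / lam_l) * (1 - exp(-lam_l * x))."""
    if x <= 0:
        return 0.0
    prod_all = math.prod(rates)
    total = 0.0
    for l, lam_l in enumerate(rates):
        denom = math.prod(lam - lam_l for i, lam in enumerate(rates) if i != l)
        total += (prod_all / denom) / lam_l * (1.0 - math.exp(-lam_l * x))
    return total

def order_quantity(rates, c1, c2, s, tol=1e-9):
    """Critical fractile of F_Z for K = 0: solve F_Z(x) = (c2 - c1)/(c2 - s)."""
    q = (c2 - c1) / (c2 - s)
    lo, hi = 0.0, 1.0
    while exodic_cdf(hi, rates) < q:    # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:                # bisection on the monotone CDF
        mid = 0.5 * (lo + hi)
        if exodic_cdf(mid, rates) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical costs: c1 = 1, c2 = 3, s = 0, so the target fractile is 2/3.
x_star = order_quantity([1.0, 2.0], c1=1.0, c2=3.0, s=0.0)
```

With rates 1 and 2 the distribution reduces to $F_Z(x) = (1 - e^{-x})^2$, which gives a closed-form check on the bisection result. For $K > 0$ the first-order condition also involves the density $f_{Z_n}$ and need not have a unique root, so this fractile shortcut does not apply.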

The remainder of this section is devoted to characterizing the "goodness" of
our approximation for the provisioning problem. Assume henceforth that $G(N, A)$
satisfies the conditions of Theorem 3.

Our  first  result  states  that  for  any  size  x  of  the  order,  the  surrogate  problem  provides 
an  asymptotically  correct  approximation  to  the  total  cost. 
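Proposition 4.

    $\lim_{n \to \infty} \frac{C(Z_n, x)}{C(\xi_n, x)} = 1$    a.s. uniformly in $x$.

Proof: Since $f^+ - g^+ \le (f - g)^+$ for all functions $f$ and $g$, we have
from (16)

    $\frac{C(Z_n, x)}{C(\xi_n, x)} - 1 \le \frac{c_2 (Z_n - \xi_n)^+ + s (Z_n - \xi_n)^+ + K (1_{\{Z_n > x\}} - 1_{\{\xi_n > x\}})}{c_1 x + c_2 (\xi_n - x)^+ + K 1_{\{\xi_n > x\}} - s (x - \xi_n)^+}$.

Now, since $s < c_1 < c_2$,

    $c_1 \xi_n \le c_1 x + c_2 (\xi_n - x)^+ + K 1_{\{\xi_n > x\}} - s (x - \xi_n)^+$    a.s. $\forall x$.

Therefore

    $\frac{C(Z_n, x)}{C(\xi_n, x)} - 1 \le \frac{c_2 + s}{c_1} \left[ \frac{Z_n}{\xi_n} - 1 \right] + \frac{K}{c_1 \xi_n}$    a.s., because $Z_n \ge \xi_n$ a.s.

Since $Z_n / \xi_n \to 1$ a.s. by Theorem 3(a), and $\xi_n \to \infty$ a.s., the
Proposition follows by taking limits as $n \to \infty$ and noting that the
right-hand side is independent of $x$.  ∎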


We  require  some  technical  lemmas  in  order  to  obtain  our  main  result  on  the 
asymptotic  precision  of  the  exodic  tree  as  a  surrogate  for  the  MST. 

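Lemma 5.

    $\lim_{n \to \infty} \frac{EC(Z_n, x)}{EC(\xi_n, x)} = 1$    uniformly in $x$.

Proof: As in Proposition 4, $s < c_1 < c_2 \Rightarrow c_1 \xi_n \le C(\xi_n, x)$
a.s. $\forall x$. Hence,

    $c_1 E\xi_n \le EC(\xi_n, x)$    $\forall x$.

Now from (17),

    $\frac{EC(Z_n, x)}{EC(\xi_n, x)} - 1 \le \frac{c_2 E(Z_n - \xi_n)^+ + s E(Z_n - \xi_n)^+ + K (P\{Z_n > x\} - P\{\xi_n > x\})}{c_1 E\xi_n}$.

Again, since $Z_n \ge \xi_n$ a.s.,

    $\frac{EC(Z_n, x)}{EC(\xi_n, x)} - 1 \le \frac{c_2 + s}{c_1} \left[ \frac{EZ_n}{E\xi_n} - 1 \right] + \frac{K}{c_1 E\xi_n}$.

Application of Theorem 3(c) as $n \to \infty$ then completes the proof.  ∎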
Lemma 6. Let $\{f_n\}$, $\{g_n\}$ be sequences of real-valued functions having
global minima at points $\{x_n^*\}$, $\{y_n^*\}$ respectively. If

    $\lim_{n \to \infty} \frac{f_n(u)}{g_n(u)} = 1$    uniformly in $u$,

then

    $\lim_{n \to \infty} \frac{f_n(x_n^*)}{g_n(y_n^*)} = 1$.

Proof: By the optimality of $x_n^*$ and $y_n^*$

    $\frac{f_n(x_n^*)}{g_n(x_n^*)} \le \frac{f_n(x_n^*)}{g_n(y_n^*)} \le \frac{f_n(y_n^*)}{g_n(y_n^*)}$    (19)

and by the assumed uniform convergence, the left and right sides of (19) both
converge to 1.  ∎


The  main  result  can  now  be  established. 

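Proposition 7. Let there exist $x^*_{\xi_n}$ and $x^*_{Z_n}$ such that

    $EC(\xi_n, x^*_{\xi_n}) = \min_x EC(\xi_n, x)$

and

    $EC(Z_n, x^*_{Z_n}) = \min_x EC(Z_n, x)$.

Then

    (a) $\lim_{n \to \infty} \frac{EC(Z_n, x^*_{Z_n})}{EC(\xi_n, x^*_{\xi_n})} = 1$,

    (b) $\lim_{n \to \infty} \frac{EC(\xi_n, x^*_{Z_n})}{EC(\xi_n, x^*_{\xi_n})} = 1$.

Proof: For definiteness, when $x^*_{\xi_n}$ and $x^*_{Z_n}$ are not unique, we
take the smallest among such values. Then (a) is a direct application of
Lemmas 5 and 6.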

To prove (b), first observe that

    $C(Z_n, x) - C(\xi_n, x) = c_2 (Z_n - \xi_n)$                  if $Z_n \ge \xi_n > x$,
    $C(Z_n, x) - C(\xi_n, x) = c_2 (Z_n - x) + K + s (x - \xi_n)$  if $Z_n > x \ge \xi_n$,
    $C(Z_n, x) - C(\xi_n, x) = s (Z_n - \xi_n)$                    if $x \ge Z_n \ge \xi_n$.

Therefore, since $\xi_n \le Z_n$ w.p. 1 by Theorem 1, we have

    $C(\xi_n, x) \le C(Z_n, x)$    $\forall x$ w.p. 1

and hence

    $EC(\xi_n, x) \le EC(Z_n, x)$    $\forall x$.    (20)

Notice that

    $1 \le \frac{EC(\xi_n, x^*_{Z_n})}{EC(\xi_n, x^*_{\xi_n})} \le \frac{EC(Z_n, x^*_{Z_n})}{EC(\xi_n, x^*_{\xi_n})}$

where the first inequality follows from the optimality of $x^*_{\xi_n}$ for
$EC(\xi_n, x)$, and the second from (20). Now by taking limits as $n \to \infty$
and using part (a), we get the desired convergence.  ∎

Proposition 7 may be interpreted in the following manner. For each instance of
the problem, the MST network is constructed according to the "true" arc lengths
that become known at the implementation stage. Therefore, the expected total
cost function that the planner really faces is $EC(\xi_n, x)$, which by
definition is minimized at $x^*_{\xi_n}$. Then $EC(\xi_n, x^*_{Z_n})$ is the
expected total cost the planner will incur by using $x^*_{Z_n}$, the optimal
order quantity computed for the surrogate problem. Proposition 7(b) asserts that
the proportional error in optimal expected cost, induced by using the surrogate
problem, tends to zero. This approach to approximating the provisioning problem
is similar in spirit to that of Dempster et al. [1983] in their analysis of
hierarchical scheduling problems.

5  Concluding  Remarks  and  Open  Questions 

It seems likely that the conclusions of Theorem 3 will hold under much weaker
conditions. One generalization would be to establish these results under the more
general "independent" growth model for random graphs (Weide [1978]). It may
also be possible to relax the requirement of a uniform positive lower support $a$
for all $F_{ij}$'s, and prove the theorem for positive, but not necessarily equal,
lower supports $a_{ij}$'s for the $F_{ij}$'s respectively. However, we would not
be able to do away entirely with positive supports since we know from Frieze's
[1985] result that in the absence of these, the MST length converges to a limit
for complete graphs with i.i.d. arc lengths. In particular, for complete graphs
with i.i.d. exponential ($\lambda$) arc lengths, $E\xi_n \to \zeta(3)/\lambda$
and from (12), $EZ_n \sim \frac{1}{\lambda} \log n$.

One of the analytical difficulties in strengthening Theorem 3 lies in not having
tight lower bounds that we can exploit. We use $\Lambda$ in Section 2 to examine
how node numbering affects the "goodness" of $EZ$, and work with $(n - 1)a$ in
Theorem 3, but none of these is very tight. A tighter lower bound to $\xi$ is
$\sum_{k=1}^{n-1} X_{(k)}$, the sum of the first $n - 1$ order statistics of the
$X_{ij}$'s, and it remains an open question to see if this or any other lower
bound might enable a generalization of our results.

Acknowledgement:  We  are  grateful  to  an  anonymous  referee  for  simplifying 
an  earlier  proof  of  Proposition  2. 